Network analysis of named entity interactions in written texts
نویسنده
چکیده
The use of methods borrowed from statistics and physics has allowed for the discovery of unprecedent patterns of human behavior and cognition by establishing links between models features and language structure. While current models have been useful to identify patterns via analysis of syntactical and semantical networks, only a few works have probed the relevance of investigating the structure arising from the relationship between relevant entities such as characters, locations and organizations. In this study, we introduce a model that links entities appearing in the same context in order to capture the complexity of entities organization through a networked representation. Computational simulations in books revealed that the proposed model displays interesting topological features, such as short typical shortest path length, high values of clustering coefficient and modular organization. The effectiveness of the our model was verified in a practical pattern recognition task in real networks. When compared with the traditional word adjacency networks, our model displayed optimized results in identifying unknown references in texts. Because the proposed model plays a complementary role in characterizing unstructured documents via topological analysis of named entities, we believe that it could be useful to improve the characterization written texts when combined with other traditional approaches based on statistical and deeper paradigms.
منابع مشابه
سیستم شناسایی و طبقهبندی موجودیتهای اسمی در متون زبان فارسی بر پایه شبکه عصبی
Named Entity Recognition (NER) is a fundamental task in natural language processing and also known as a subset of information extraction. We seek to locate and classify named entities in text into predefined categories such as the names of persons, organizations, locations, expressions of times, etc. Named Entity Recognition for English texts has been researched widely for the past years, howev...
متن کاملبهبود شناسایی موجودیتهای نامدار فارسی با استفاده از کسره اضافه
Named entity recognition is a process in which the people’s names, name of places (cities, countries, seas, etc.) and organizations (public and private companies, international institutions, etc.), date, currency and percentages in a text are identified. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...
متن کاملRule-based Named Entity Extraction For Ontology Population
Currently, Text analysis techniques such as named entity recognition rely mainly on ontologies which represent the semantics of an application domain. To build such an ontology from specialized texts, this article presents a tool which detects proper names, locations and dates from texts by using manually written linguistic rules. The most challenging task is to extract not only entities but al...
متن کاملتشخیص اسامی اشخاص با استفاده از تزریق کلمههای نامزد اسم در میدانهای تصادفی شرطی برای زبان عربی
Named Entity Recognition and Extraction are very important tasks for discovering proper names including persons, locations, date, and time, inside electronic textual resources. Accurate named entity recognition system is an essential utility to resolve fundamental problems in question answering systems, summary extraction, information retrieval and extraction, machine translation, video interpr...
متن کاملNamed Entity Recognition in Persian Text using Deep Learning
Named entities recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1509.05281 شماره
صفحات -
تاریخ انتشار 2015